Issues Related to Sampling Techniques for Network Traffic Dataset

نویسندگان

  • Raman Singh
  • Harish Kumar
  • R. K. Singla
چکیده

Network traffic data is huge, varying and imbalanced because various classes are not equally distributed. Machine learning (ML) algorithms for traffic analysis uses the samples from this data to recommend the actions to be taken by the network administrators. Due to imbalances in dataset, machine learning algorithms may give biased or false results leading to serious degradation in performance of these algorithms. Since the network dataset is huge, during training machine learning algorithm takes more time and hence sampling should be used to reduce the training time. But using sampling may cause loss of information which should be taken care off while obtaining the samples. In this paper various sampling techniques have been analysed for loss of information and imbalances during sampling of network traffic data. Data set is collected from the Panjab University network. Various parameters like missing classes in samples, probability of sampling of the different instances have been considered for comparison.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sampling Based Approaches to Handle Imbalances in Network Traffic Dataset for Machine Learning Techniques

Network traffic data is huge, varying and imbalanced because various classes are not equally distributed. Machine learning (ML) algorithms for traffic analysis uses the samples from this data to recommend the actions to be taken by the network administrators as well as training. Due to imbalances in dataset, it is difficult to train machine learning algorithms for traffic analysis and these may...

متن کامل

Behavioral Analysis of Traffic Flow for an Effective Network Traffic Identification

Fast and accurate network traffic identification is becoming essential for network management, high quality of service control and early detection of network traffic abnormalities. Techniques based on statistical features of packet flows have recently become popular for network classification due to the limitations of traditional port and payload based methods. In this paper, we propose a metho...

متن کامل

Performance of OpenDPI in Identifying Sampled Network Traffic

The identification of the nature of the traffic flowing through a TCP/IP network is a relevant target for traffic engineering and security related tasks. Despite the privacy concerns it arises, Deep Packet Inspection (DPI) is one of the most successful current techniques. Nevertheless, the performance of DPI is strongly limited by computational issues related to the huge amount of data it needs...

متن کامل

Convergence Optimization of Backpropagation Artificial Neural Network Used for Dichotomous Classification of Intrusion Detection Dataset

There are distinguished two categories of intrusion detection approaches utilizing machine learning according to type of input data. The first one represents network intrusion detection techniques which consider only data captured in network traffic. The second one represents general intrusion detection techniques which intake all possible data sources including host-based features as well as n...

متن کامل

DTMP: Energy Consumption Reduction in Body Area Networks Using a Dynamic Traffic Management Protocol

Advances in medical sciences with other fields of science and technology is closely casual profound mutations in different branches of science and methods for providing medical services affect the lives of its descriptor. Wireless Body Area Network (WBAN) represents such a leap. Those networks excite new branches in the world of telemedicine. Small wireless sensors, to be quite precise and calc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013